Safe Screening for Multi-Task Feature Learning with Multiple Data Matrices
نویسندگان
چکیده
Multi-task feature learning (MTFL) is a powerful technique in boosting the predictive performance by learning multiple related classification/regression/clustering tasks simultaneously. However, solving the MTFL problem remains challenging when the feature dimension is extremely large. In this paper, we propose a novel screening rule—that is based on the dual projection onto convex sets (DPC)—to quickly identify the inactive features—that have zero coefficients in the solution vectors across all tasks. One of the appealing features of DPC is that: it is safe in the sense that the detected inactive features are guaranteed to have zero coefficients in the solution vectors across all tasks. Thus, by removing the inactive features from the training phase, we may have substantial savings in the computational cost and memory usage without sacrificing accuracy. To the best of our knowledge, it is the first screening rule that is applicable to sparse models with multiple data matrices. A key challenge in deriving DPC is to solve a nonconvex problem. We show that we can solve for the global optimum efficiently via a properly chosen parametrization of the constraint set. Moreover, DPC has very low computational cost and can be integrated with any existing solvers. We have evaluated the proposed DPC rule on both synthetic and real data sets. The experiments indicate that DPC is very effective in identifying the inactive features—especially for high dimensional data—which leads to a speedup up to several orders of magnitude. Proceedings of the 32 International Conference on Machine Learning, Lille, France, 2015. JMLR: W&CP volume 37. Copyright 2015 by the author(s).
منابع مشابه
Online Multi-Task Learning via Sparse Dictionary Optimization
This paper develops an efficient online algorithm for learning multiple consecutive tasks based on the KSVD algorithm for sparse dictionary optimization. We first derive a batch multi-task learning method that builds upon K-SVD, and then extend the batch algorithm to train models online in a lifelong learning setting. The resulting method has lower computational complexity than other current li...
متن کاملMLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
متن کاملComposite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
متن کاملOnline Multi-Task Learning based on K-SVD
This paper develops an efficient online algorithm based on K-SVD for learning multiple consecutive tasks. We first derive a batch multi-task learning method that builds upon the K-SVD algorithm, and then extend the batch algorithm to train models online in a lifelong learning setting. The resulting method has lower computational complexity than other current lifelong learning algorithms while m...
متن کاملSaliency Detection by Multi-Task Sparsity Pursuit(Double Column).dvi
Saliency Detection by Multi-Task Sparsity Pursuit Congyan Lang, Guangcan Liu, Member, IEEE, Jian Yu, and Shuicheng Yan, Senior Member, IEEE, Abstract—This paper addresses the problem of detecting salient areas within natural images. We shall mainly study the problem under unsupervised setting, namely saliency detection without learning from labeled images. A solution of multi-task sparsity purs...
متن کامل